Received: from cheltenham.CS.Arizona.EDU by optima.CS.Arizona.EDU (5.65c/15) via SMTP
id AA24317; Mon, 3 May 1993 00:35:45 MST
Date: Mon, 3 May 1993 00:35:43 MST
From: "Nick Kline" <kline>
Message-Id: <199305030735.AA16352@cheltenham.cs.arizona.edu>
Received: by cheltenham.cs.arizona.edu; Mon, 3 May 1993 00:35:43 MST
To: tsql
Subject: updated valid-time aggregate defs
I've updated the definitions concerning valid-time partitioning in response
to several comments.
I eliminated the definitions for partitioning attribute and value partitioning
since these are not temporal database aspects. They were included for
completeness before.
I have tried to more clearly explain the motivations behind having an
*associated interval* with each valid-time temporal element (TE).
I actually partition the time-line into TE's and associate with each TE an
interval. The reason for this association is two-fold:
1) it's useful for the subdivision of the valid time-line to be
a partitioning (in the mathematical sense)
2) it's very useful to allow overlapping intervals, which a strict
partitioning (pt. 1 above) would exclude
Defining the terms this way is very general, yet it allows succinct
definitions (excluding the long discussions!).
The revised definitions follow.
Please contact me with any correspondence.
Thanks,
Nick Kline
kline@cs.arizona.edu
% Document Type: LaTeX
\documentstyle[11pt]{article}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% VARIOUS MACROS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\long\def\comment#1{}
\newcommand{\entry}[1]{\subsubsection*{#1}}
\addtolength{\textwidth}{1.485in}%{1.2in}
\setlength{\oddsidemargin}{.1in}%{.3in}
\setlength{\evensidemargin}{.1in}%{.3in}
\addtolength{\topmargin}{-.85in} %{-1.35in}
\addtolength{\textheight}{1.8in} %{2.8in}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PAPER START
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\subsection{Valid-time Partitioning}
\entry{Definition}
{\em Valid-time partitioning} is the partitioning (in the mathematical
sense) of the valid time-line into {\em valid-time elements}. With
each valid-time element we associate an interval of the valid
time-line, over which a cumulative aggregate may then be applied.
\entry{Alternative Names}
Valid-time grouping.
\entry{Discussion}
To compute the aggregate, first partition the time-line into
valid-time elements, then associate an interval with each valid-time
element, assemble the tuples valid over each interval, and finally
compute the aggregate over each of these sets. The value at any event
is the value computed over the partitioning element that contains that
event.
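
The steps above can be written compactly. The following is a sketch
only: the symbols $T$, $E_i$, $I_i$, $S_i$, $r$, $f$, $A$, and
$\mbox{valid}(t)$ are notation assumed here for illustration, and
``valid over an interval'' is read as ``valid time overlaps the
interval.'' Let $T$ be the valid time-line, $r$ the relation, and $f$
the aggregate. The valid-time elements $E_1, E_2, \ldots$ satisfy
\[
\bigcup_i E_i = T, \qquad E_i \cap E_j = \emptyset \quad (i \neq j),
\]
and each $E_i$ has an associated interval $I_i \subseteq T$. Then
\[
S_i = \{\, t \in r \mid \mbox{valid}(t) \cap I_i \neq \emptyset \,\}, \qquad
A(e) = f(S_i) \quad \mbox{for the unique $i$ with $e \in E_i$},
\]
where $A(e)$ is the value of the aggregate at event $e$. No condition
is placed on $I_i \cap I_j$, so the associated intervals may overlap.
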
We associate a separate interval with each valid-time element because
we want the valid-time elements to form a true {\em partition} of the
valid time-line without excluding certain queries. If the aggregate
could not be computed on overlapping intervals, we would exclude
queries such as ``Find the average salary paid for one year before
each hire,'' because the one-year intervals before each hire might
overlap.
Partitioning the time-line is a useful capability for aggregates in
temporal databases (+R1,+R3).
The term {\em grouping} is inappropriate because the valid-time
elements form a true partition: they do not overlap and together they
cover the time-line. However, the associated intervals may be defined
in any way.
One example of valid-time partitioning is to divide the time-line into
years, based on the Gregorian calendar. Then, for each year, count
the tuples whose valid times overlap that year.
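
As a small illustration with made-up data (the tuples $t_1$ and $t_2$
and their valid times are hypothetical), suppose $t_1$ is valid from
March 1990 to June 1991 and $t_2$ is valid from February 1991 to
September 1991. With one valid-time element per calendar year, and
each year serving as its own associated interval,
\[
\mbox{count}(1990) = |\{t_1\}| = 1, \qquad
\mbox{count}(1991) = |\{t_1, t_2\}| = 2,
\]
since $t_1$ overlaps both years and $t_2$ overlaps only 1991.
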
There is no existing term for this concept. There is no partitioning
attribute in valid-time partitioning, since the partitioning does not
depend on attribute values, but instead on valid-times.
Valid-time partitioning may occur before or after value partitioning.
\subsection{Dynamic Valid-time Partitioning}
\entry{Definition}
In {\em dynamic valid-time partitioning}, the valid-time elements used
in the partitioning are determined solely from the timestamps of the
relation.
\entry{Alternative Names}
Moving window.
\entry{Discussion}
The term {\em dynamic} (as opposed to {\em static}) is appropriate
because the partitioning intervals may change if the information in
the database changes: the intervals are determined from intrinsic
information.
One example of dynamic valid-time partitioning would be to compute the
average value of an attribute in a relation (say the salary
attribute) over the year preceding the stop-time of each tuple.
One technique for computing this query is, for each tuple, to find
all tuples valid during the year preceding the stop-time of the
tuple in question and combine these tuples into a set, and finally
to compute the average of the salary attribute values in each set.
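
In symbols, and only as a sketch (the notation $s_t$, $I_t$, $S_t$,
$\mbox{avg}(t)$, and $\mbox{valid}(u)$ is assumed here, intervals are
taken to be closed, and ``valid in'' is read as ``valid time
overlaps''): for a tuple $t$ of the relation $r$ with stop-time $s_t$,
\[
I_t = [\, s_t - 1\ \mbox{year},\ s_t \,], \qquad
S_t = \{\, u \in r \mid \mbox{valid}(u) \cap I_t \neq \emptyset \,\},
\]
\[
\mbox{avg}(t) = \frac{1}{|S_t|} \sum_{u \in S_t} u.\mbox{salary}.
\]
Under these assumptions $S_t$ always contains $t$ itself, so the
average is well defined. The intervals $I_t$ depend only on the
timestamps stored in $r$, which is what makes the partitioning
dynamic.
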
It may seem inappropriate to use valid-time elements instead of
intervals; however, there is no reason to exclude valid-time elements.
For example, the elements might be the intervals during which the
relation is constant.
The existing term for this concept has no opposing term suitable for
referring to static valid-time partitioning, and cannot
distinguish between the two types of valid-time partitioning (-E3, +E9).
Various temporal query languages have used both dynamic and static
valid-time partitioning, but have not always been clear about which
type of partitioning they support (+E1). Using these terms
will remove this ambiguity from future discussions.
\subsection{Static Valid-time Partitioning}
\entry{Definition}
In {\em static valid-time partitioning} the valid-time elements used are
determined solely from fixed points on a calendar, such as the start
of each year.
\entry{Alternative Names}
Moving window.
\entry{Discussion}
This term further distinguishes existing terms (-E3, +E9). It is an
obvious parallel to dynamic valid-time partitioning (+E1). Static is
an appropriate term because the valid-time elements are determined
from extrinsic information. The partitioning elements would not
change if the information in the database changed.
Computing the maximum salary of employees during each month is an
example that requires static valid-time partitioning. To compute
this information, first divide the time-line into valid-time elements
where each element represents a separate month on, say, the Gregorian
calendar. Then, find the tuples valid over each valid-time element,
and compute the maximum aggregate over the members of each set.
\subsection{Valid-time Cumulative Aggregation}
\entry{Definition}
In {\em cumulative aggregation}, for each valid-time element of the
valid-time partitioning (produced by either dynamic or static valid-time
partitioning), the aggregate is applied to all tuples associated with that
valid-time element.
The value of the aggregate at any event is the value computed over the
partitioning element that contains that event.
\entry{Alternative Names}
Moving window.
\entry{Discussion}
{\em Cumulative} is used because the interesting values are defined
over a cumulative range of time (+E8). This term is more precise than
the existing term (-E3, +E9). Instantaneous aggregation may be
considered to be a degenerate case of cumulative aggregation where the
partition is per chronon and the associated interval is that chronon.
One example of cumulative aggregation would be to find the total number
of employees who had worked at some point for a company. To compute
this value at the end of each calendar year, take each year as a
valid-time element and associate with it the interval from the
beginning of time up to the end of that year. For each valid-time
element, find all tuples which overlap its associated interval, and
finally, count the number of tuples in each set.
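
This example can be written as a formula, under assumed notation:
$t_0$ denotes the beginning of time, $\mbox{end}(y)$ the last chronon
of year $y$, $r$ the relation, and $\mbox{start}(u)$ the start of the
valid time of tuple $u$, which is taken here to be a single interval.
The interval used for year $y$ is $[t_0, \mbox{end}(y)]$, and the
count is
\[
N(y) = |\{\, u \in r \mid \mbox{valid}(u) \cap [t_0, \mbox{end}(y)] \neq \emptyset \,\}|
     = |\{\, u \in r \mid \mbox{start}(u) \le \mbox{end}(y) \,\}|.
\]
Because these intervals are nested, $N(y) \le N(y+1)$: the count never
decreases from one year to the next.
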
\subsection{Instantaneous Aggregation}
\entry{Definition}
In {\em instantaneous aggregation}, for each chronon on the valid
time-line, the aggregate is applied to all tuples valid at that event.
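
In symbols (a sketch; $r$, $f$, $A$, and $\mbox{valid}(t)$ are
notation assumed here for illustration), the value of an aggregate $f$
over a relation $r$ at chronon $c$ is
\[
A(c) = f(\{\, t \in r \mid c \in \mbox{valid}(t) \,\}).
\]
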
\entry{Alternative Names}
None.
\entry{Discussion}
The term {\em instantaneous} is appropriate because the aggregate is
applied over every chronon, every event. It suggests an interest in
the aggregate value over a very small time interval, an instant, much
as instantaneous acceleration is defined in physics over an
infinitesimally small time interval (+R3).
Many temporal query languages perform instantaneous aggregation, others
use cumulative aggregation, while still others use a combination of the two.
This term will be useful to distinguish between the various alternatives,
and is already used by some researchers (+R4,+E3).
\subsection{Gregorian Calendar}
\entry{Definition}
The {\em Gregorian calendar} is composed of 12 months, named in order,
January, February, March, April, May, June, July, August, September,
October, November, and December. The 12 months form a year. A year
is either 365 or 366 days in length; the extra day occurs in
``leap years.'' Leap years are the years evenly divisible by 4,
excluding centesimal years (those divisible by 100) unless they are
also divisible by 400.
Each month has a fixed number of days, except for February, the length
of which varies by a day depending on whether or not the particular
year is a leap year.
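
The leap-year rule can be stated as a single condition. Writing $y$
for the year number (the predicate name $\mbox{leap}$ is introduced
here only for illustration),
\[
\mbox{leap}(y) \iff (y \bmod 4 = 0 \;\land\; y \bmod 100 \neq 0)
  \;\lor\; (y \bmod 400 = 0).
\]
For example, 1992 and 2000 are leap years, while 1900 and 1993 are not.
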
\entry{Alternative Names}
None.
\entry{Discussion}
The Gregorian calendar is widely used and accepted (+E3,+E7). This term is
defined and used elsewhere (-R1), but is in such common use in
temporal databases that it should be defined here as well.
\end{document}